AITopics | latent dimension

This paper proposes StrTransformer, a source-wise structured Transformer framework for blind source recovery and branch-wise latent modeling. Instead of using an encoder to infer latent variables, StrTransformer directly optimizes the latent source matrix together with an observation-space mixer and source-wise structural Transformer branches. The mixer enforces reconstruction consistency, while each Transformer branch imposes a differentiable structural constraint on one latent source trajectory. Specifically, each source is converted into multi-scale patch tokens, randomly masked, processed by a locality-biased Transformer, and evaluated through a masked patch reconstruction energy. This energy acts as an implicit source-wise structural prior. To encourage different latent branches to specialize into different temporal regimes, StrTransformer further introduces an ordered multi-scale controller that learns branch-specific patch-scale weights, ordered scale centers, and locality attention slopes. The resulting objective combines observation reconstruction, source-wise structural regularization, and modular auxiliary penalties for separation and scale specialization. We analyze the decoupling and coupling structure of the objective, the regularized exact-reconstruction fiber, and the reduction of permutation symmetry induced by ordered branch descriptors. A controlled case study shows that the learned branches converge to distinct temporal-scale structures and recover source-aligned latent trajectories under post-hoc evaluation.
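The masked patch reconstruction energy described above can be illustrated with a small sketch. It is not the StrTransformer implementation: the locality-biased Transformer is replaced by a placeholder predictor (`mean_patch`), and the function names, the patch lengths, and the mask ratio are all assumptions made for illustration. Every patch length is assumed to be no larger than the series length.

```python
import random

def patchify(series, patch_len):
    """Split a 1-D series into non-overlapping patches, dropping any ragged tail."""
    count = len(series) // patch_len
    return [series[i * patch_len:(i + 1) * patch_len] for i in range(count)]

def mean_patch(visible, patch_len):
    """Placeholder predictor (stands in for the Transformer): mean of visible patches."""
    if not visible:
        return [0.0] * patch_len
    return [sum(p[j] for p in visible) / len(visible) for j in range(patch_len)]

def masked_energy(series, patch_lens, mask_ratio, predict, rng):
    """Mean squared error of `predict` on randomly masked patches, averaged over scales."""
    per_scale = []
    for patch_len in patch_lens:
        patches = patchify(series, patch_len)
        n_masked = max(1, round(mask_ratio * len(patches)))
        masked = set(rng.sample(range(len(patches)), n_masked))
        visible = [p for i, p in enumerate(patches) if i not in masked]
        guess = predict(visible, patch_len)  # one guess shared by all masked patches
        err = 0.0
        for i in masked:
            err += sum((g - x) ** 2 for g, x in zip(guess, patches[i])) / patch_len
        per_scale.append(err / n_masked)
    return sum(per_scale) / len(per_scale)

rng = random.Random(0)
flat = [1.0] * 16                      # trivially predictable source
ramp = [float(t) for t in range(16)]   # structure the placeholder cannot capture
flat_energy = masked_energy(flat, (2, 4), 0.5, mean_patch, rng)
ramp_energy = masked_energy(ramp, (2, 4), 0.5, mean_patch, rng)
print(flat_energy, ramp_energy)
```

A source whose masked patches are easy to predict from the visible ones gets a low energy, which is the sense in which such a term can act as a structural prior on one latent trajectory.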

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2605.25648

Genre: Research Report > Experimental Study (0.34)

Industry: Energy (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Data Science (0.68)

An Elastic Shape Variational Autoencoder for Skeleton Pose Trajectories

Rahman, Arafat, Kumar, Shashwat, Barnes, Laura E., Srivastava, Anuj

arXiv.org Machine Learning, May-18-2026

Deep generative models provide flexible frameworks for modeling complex, structured data such as images, videos, 3D objects, and text. However, when applied to sequences of human skeletons, standard variational autoencoders (VAEs) often allocate substantial capacity to nuisance factors (such as camera orientation, subject scale, viewpoint, and execution speed) rather than to the intrinsic geometry of shapes and their motion. We propose the Elastic Shape Variational Autoencoder (ES-VAE), a geometry-aware generative model for skeletal trajectories that leverages the transported square-root velocity field (TSRVF) representation on Kendall's shape manifold. This representation inherently removes rigid translations, rotations, and global scaling of shapes, as well as temporal rate variability of sequences, isolating the underlying shape dynamics. The ES-VAE encoder maps skeletal sequences to a low-dimensional latent space incorporating the Riemannian logarithm map, while the decoder reconstructs sequences using the corresponding exponential map. We demonstrate the effectiveness of ES-VAE on two datasets. First, we analyze skeletal gait cycles to predict clinical mobility scores and classify subjects into healthy and post-stroke groups. Second, we evaluate action recognition on the NTU RGB+D dataset. Across both settings, ES-VAE consistently outperforms standard VAEs and a range of sequence modeling baselines, including temporal convolutional networks, transformers, and graph convolutional networks. More broadly, ES-VAE provides a principled framework for learning generative models of longitudinal data on pose shape manifolds, offering improved latent representation and downstream performance compared to existing deep learning approaches.
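The translation and scale invariance mentioned above can be illustrated with the centering-and-normalization step that produces a Kendall preshape. This is a minimal sketch and not the ES-VAE pipeline: rotation removal (Procrustes alignment) and the TSRVF temporal representation are omitted, and `preshape` and the toy `skeleton` landmarks are hypothetical.

```python
import math

def preshape(landmarks):
    """Center a landmark configuration and scale it to unit Frobenius norm."""
    n = len(landmarks)
    dim = len(landmarks[0])
    centroid = [sum(p[j] for p in landmarks) / n for j in range(dim)]
    centered = [[p[j] - centroid[j] for j in range(dim)] for p in landmarks]
    norm = math.sqrt(sum(c * c for row in centered for c in row))
    if norm == 0.0:
        raise ValueError("all landmarks coincide; the shape is undefined")
    return [[c / norm for c in row] for row in centered]

# Toy 4-joint "skeleton" in 3-D (made-up coordinates).
skeleton = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0), (0.0, 1.0, 3.0)]
# Same pose seen at a different position and a different global scale.
moved = [(2.5 * x + 4.0, 2.5 * y - 1.0, 2.5 * z + 7.0) for x, y, z in skeleton]

a, b = preshape(skeleton), preshape(moved)
same = all(math.isclose(u, v, abs_tol=1e-9)
           for row_a, row_b in zip(a, b) for u, v in zip(row_a, row_b))
print(same)
```

Both configurations map to the same preshape, so a model fed preshapes does not need to spend capacity on position or subject scale.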

artificial intelligence, deep learning, machine learning, (18 more...)

2605.09231

Country: North America > United States > Virginia (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Embedding Dimension Lower Bounds for Universality of Deep Sets and Janossy Pooling

Syed, Ali, Nambiar, Aditya, Siegel, Jonathan W.

arXiv.org Machine Learning, May-12-2026

In many practical applications it is important to build symmetries into neural network architectures. Consider the important case of permutation symmetry on point clouds consisting of $n$ points in $d$ dimensions. In this case the network learns a function on a set of $n$ points in $\mathbb{R}^d$, and a natural paradigm for constructing invariant networks is Janossy pooling, which generalizes the popular Deep Sets architecture. We study the universality of this approach, in particular the important question of how large the embedding dimension must be to guarantee universality of this architecture. Specifically, using a novel technique, we prove new lower bounds on the required size of this embedding dimension. For Deep Sets, this gives the correct minimal dimension up to a constant factor for all $d > 1$. For $k$-ary Janossy pooling, we prove the first non-trivial lower bound on the required embedding dimension when $k > 1$.
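The Deep Sets form referred to above, a readout applied to a sum of per-point embeddings, can be sketched as follows. The feature map `phi`, the readout `rho`, and the default `embed_dim` are hypothetical fixed functions chosen only to demonstrate permutation invariance; they are not learned and say nothing about the lower bounds proved in the paper.

```python
import math
import random

def phi(point, embed_dim):
    """Hypothetical per-point feature map into R^embed_dim (fixed, not learned)."""
    return [math.tanh(sum((k + 1) * (j + 1) * x for j, x in enumerate(point)) / 10.0)
            for k in range(embed_dim)]

def rho(pooled):
    """Hypothetical readout applied to the pooled embedding."""
    return sum(v * v for v in pooled)

def deep_sets(points, embed_dim=4):
    """Compute rho(sum_i phi(x_i)); the sum discards the ordering of the points."""
    pooled = [0.0] * embed_dim
    for p in points:
        for k, v in enumerate(phi(p, embed_dim)):
            pooled[k] += v
    return rho(pooled)

cloud = [(0.1, 0.2), (1.5, -0.3), (-0.7, 0.9)]  # n = 3 points in d = 2 dimensions
shuffled = cloud[:]
random.Random(0).shuffle(shuffled)
invariant = math.isclose(deep_sets(cloud), deep_sets(shuffled), rel_tol=1e-9)
print(invariant)
```

Here `embed_dim` plays the role of the embedding dimension: it is the width of the vector through which all information about the point set must pass before the readout.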

artificial intelligence, janossy, machine learning, (17 more...)

2605.08377

Country: North America > United States > Texas > Brazos County > College Station (0.14)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Appendix - Scalable Bayesian GPFA with automatic relevance determination and discrete noise models. A Further analyses of preparatory dynamics in the primate reaching task

Neural Information Processing Systems, Apr-26-2026, 00:56:00 GMT

Here we briefly consider why introducing a prior over the factor matrix enables automatic relevance determination. These ideas reflect results by Bishop [1] and our experiments in Section 3.1. For simplicity, we will first consider the case of factor analysis where $p(X) = \prod_{d,t} \mathcal{N}(x_{dt}; 0, 1)$.

artificial intelligence, dimension, machine learning, (18 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
